perm filename SYNTAX.TEX[MF,DEK]1 blob
sn#742951 filedate 1984-02-22 generic text, type C, neo UTF8
% Interim specs of METAFONT start on the next page
\font\ninerm=amr9
\let\mc=\ninerm % medium caps for names like PASCAL
\font\logo=manfnt % font used for the METAFONT logo
\def\MF{{\logo META}\-{\logo FONT}}
\font\tenss=amss10 % for `The METAFONTbook'
\def\today{\ifcase\month\or
January\or February\or March\or April\or May\or June\or
July\or August\or September\or October\or November\or December\fi
\space\number\day, \number\year}
\def\newsection #1. #2\par
{\medbreak\noindent{\bf #1.\enspace #2\par}
\nobreak\smallskip\noindent}
\def\oct#1{\hbox{\rm\'{}\kern-.2em\it#1\/\kern.05em}} % octal constant
\def\<#1>{$\,\langle$#1$\rangle\,$}
\def\is{$\;\longrightarrow\;$}
\def\alt{$\;\mid\;$}
\def\andalso{\hskip5em \alt}
\def\syntaxlines#1{$$\openup1\jot
\halign{\hbox to\displaywidth{\indent##\hfil}\cr#1}$$}
\def\.#1{\hbox{\tt#1}}
\def\syntaxbreak{\noalign{\smallbreak}}
\line{{\bf Low-level \MF}\hfil as of \today}
\rightline{Beware: These specifications change daily!}
\bigskip\bigskip\noindent
This is a preliminary description of what the new \MF\ language will look
like at the lowest level. Please forgive the author for the terseness of
this document; there
hasn't been time to explain things yet. Also please remember that the
low-level language is not what \MF\ programmers will usually be writing;
it is intended as a vehicle for defining nicer high-level languages.
\MF\ users will almost always work with a set of macros and other definitions
called a ``base'' file; the {\tt PLAIN} base will be defined in
{\tenss The \MF book}, in a fashion similar to the way {\tt PLAIN} format
has been defined in {\sl The \TeX book}.
\newsection 1. The lowest level: Tokens.
\MF\ is governed by sequences of {\sl tokens}, which it gets either by
reading a file or by regurgitating a list of tokens that were previously
read. So we can understand tokens by understanding what happens when
\MF\ reads from a file. A file is a sequence of lines of text, where
each line of text is a sequence of zero or more characters. The characters
are assumed to be those of standard ASCII (codes \oct{040} through
\oct{176} in Appendix~C of {\sl The \TeX book}). Any other characters that
might appear in the file are treated as if they were spaces (code \oct{040}),
except that certain systems may make substitutions for certain characters.
[For example, the {\mc WAITS} implementation replaces codes \oct{030},
\oct{032}, \oct{034}, and \oct{035} by the respective character pairs {\tt
:=}, {\tt <>}, {\tt <=}, and {\tt >=}, because the special keys for those
two-character combinations on {\mc WAITS} keyboards are too tempting to
ignore.]
Each line of text is converted into zero or more tokens according to the
following rules, repeated until no more characters remain on the line:
\smallskip
\item{1)} If the next character is a space, or if it's a period that isn't
followed by a decimal digit or a period, ignore~it and move on.
\item{2)} If the next character is a percent sign, ignore it and also
ignore everything else that remains on the current line. (Percent signs
allow you to put comments in your file that are unseen by \MF.)
\item{3)} If the next character is a decimal digit or a period that's
followed by a decimal digit, the next token is called a {\sl numeric
token}. It is the longest sequence of contiguous characters starting
at the current place that satisfies the following syntax:
\syntaxlines{\<numeric token>\is\<digit string>\alt.\<digit string>
\alt\<digit string>.\<digit string>\cr
\<decimal digit>\is\.0\alt\.1\alt\.2\alt\.3\alt\.4\alt\.5\alt\.6\alt
\.7\alt\.8\alt\.9\cr
\<digit string>\is\<decimal digit>\alt\<digit string>\<decimal digit>\cr}
Numeric tokens are interpreted according to ordinary decimal notation;
the value of the number must be less than 4096. \MF\ converts decimal
fractions to the nearest multiple of $2^{-16}$.
\item{4)} If the next character is a double-quote mark
(\thinspace{\tt\char`"}\thinspace), the next token is called a {\sl string
token}. It consists of all characters following the double-quote up to
but not including the next double-quote on the current line. There must be
at least one more double-quote remaining on the line, otherwise you
get an error message.
\item{5)} If the next character is a left parenthesis, a right parenthesis,
a comma, or a semicolon, the next token is that single character.
\item{6)} Otherwise the next token consists of the next character together
with all immediately following characters of the same class.
\smallskip\noindent
Rules 1--5 tell what to do for 17 of the 95 possible ASCII characters
that might be next. The most interesting rule is number~6, which depends
on a breakdown of the remaining 78 ASCII characters into 12 {\sl
classes\/} as shown in Table~1. Two characters are in the same class if
and only if they belong to the same row in the class~table.
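To make the arithmetic of rule~3 concrete, here is a small sketch in Python (it is not part of \MF; the name \.{scaled} and the half-up treatment of ties are assumptions of this illustration, since the rule says only ``nearest''). It returns the value of a numeric token as an integer count of units of $2^{-16}$.

```python
import math
from fractions import Fraction

UNITY = 1 << 16  # one unit is 2**-16, so 1.0 corresponds to 65536 units


def scaled(token):
    """Value of a numeric token, as an integer number of 2**-16 units."""
    whole, _, frac = token.partition(".")
    exact = Fraction(int(whole or "0"))
    if frac:
        exact += Fraction(int(frac), 10 ** len(frac))
    if exact >= 4096:
        raise ValueError("the value of a numeric token must be less than 4096")
    # Nearest multiple of 2**-16; rounding ties upward is an assumption.
    return math.floor(exact * UNITY + Fraction(1, 2))


print(scaled("3.1"), scaled(".5"), scaled("007"))
```

For instance, `\.{3.1}' comes out as 203162 units, i.e., $203162/65536$, which is slightly more than 3.1.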
\topinsert
$$\vbox{\halign{\hfil\tt#\hfil&\qquad#\hfil\cr
\hidewidth
ABCDEFGHIJKLMNOPQRSTUVWXYZ\char`\_abcdefghijklmnopqrstuvwxyz\hidewidth\cr
<=>:|\cr
`'\cr
+-\cr
/*\char`\\\cr
!?\cr
\#\&@\$\cr
\char`\^\char`\~\cr
[\cr
]\cr
\char`\{\char`\}\cr
.&(see rules 1, 3, 6)\cr
,&(see rule 5)\cr
;&(see rule 5)\cr
(&(see rule 5)\cr
)&(see rule 5)\cr
"&(see rule 4)\cr
0123456789&(see rule 3)\cr
\%&(see rule 2)\cr}}$$
{\bf Table 1.}\enspace The visible ASCII characters, divided into classes.
Characters in the bottom eight rows are subject to special rules as
indicated.
\endinsert
For example, the (ridiculous) line
$$\hbox{\tt xx3.1..[[a+-bc\char`\_d.e] ]"a string \%"
<>\char`\$1."+-""" \% forget this}$$
produces 16 tokens: `\.{xx}', `\.{3.1}' (which is numeric),
`\.{..}', `\.{[[}', `\.a', `\.{+-}', `\.{bc\char`\_d}', `\.e',
`\.]', `\.]', `\.{a string \%}' (which is indeed a string),
`\.{<>}', `\.\$', `\.{1}' (another numeric token), `\.{+-}' (a string, hence
different from the other~`\.{+-}'), and `' (an empty string).
Notice that three of the spaces and two of the periods were deleted
by rule~1.
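Rules 1--6 can be sketched as a short Python routine. This is only an illustration, not \MF's actual scanner; the names \.{tokenize} and \.{CLASSES} are made up here, and system-dependent substitutions such as those of {\mc WAITS} are not modeled.

```python
import string

# The twelve classes of Table 1, one string per row (rule 6).
CLASSES = [
    string.ascii_letters + "_", "<=>:|", "`'", "+-", "/*\\", "!?",
    "#&@$", "^~", "[", "]", "{}", ".",
]
CLASS_OF = {ch: row for row, chars in enumerate(CLASSES) for ch in chars}


def tokenize(line):
    """Convert one line of text into a list of (kind, text) tokens."""
    # Characters outside codes 040-176 (octal) are treated as spaces.
    line = "".join(c if " " <= c <= "~" else " " for c in line)
    tokens, i, n = [], 0, len(line)
    while i < n:
        ch = line[i]
        nxt = line[i + 1] if i + 1 < n else ""
        if ch == " " or (ch == "." and not nxt.isdigit() and nxt != "."):
            i += 1                                        # rule 1
        elif ch == "%":
            break                                         # rule 2
        elif ch.isdigit() or (ch == "." and nxt.isdigit()):
            j = i                                         # rule 3
            while j < n and line[j].isdigit():
                j += 1
            if j + 1 < n and line[j] == "." and line[j + 1].isdigit():
                j += 1
                while j < n and line[j].isdigit():
                    j += 1
            tokens.append(("numeric", line[i:j]))
            i = j
        elif ch == '"':
            j = line.find('"', i + 1)                     # rule 4
            if j < 0:
                raise ValueError("no closing double-quote on this line")
            tokens.append(("string", line[i + 1:j]))
            i = j + 1
        elif ch in "(),;":
            tokens.append(("symbolic", ch))               # rule 5
            i += 1
        else:
            cls = CLASS_OF[ch]                            # rule 6
            j = i + 1
            while j < n and CLASS_OF.get(line[j]) == cls:
                j += 1
            tokens.append(("symbolic", line[i:j]))
            i = j
    return tokens


line = 'xx3.1..[[a+-bc_d.e] ]"a string %" <>$1."+-""" % forget this'
for kind, text in tokenize(line):
    print(kind, repr(text))
```

Applied to the ridiculous line above, the sketch yields the same 16 tokens, with the two `\.{+-}' tokens distinguished by their kinds.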
\newsection 2. The next lowest level: Variables.
But what do tokens mean? Well, numeric tokens stand for numbers and string
tokens stand for strings, but the other tokens are just arbitrary symbols
that can stand for almost anything. Let's say that tokens of the third kind
are {\sl symbolic tokens}.
Some of the symbolic tokens have predefined ``primitive'' meanings when
\MF\ begins its operations, but it is possible to change the meaning of
any symbolic token. The `\.{let}' command does this; one simply
says `\.{let} \<symbolic token>=\<symbolic token>'.
For example, you can even make a left parenthesis
denote the same thing as `\.+', if you want to confuse everybody
who tries to read your code.
Symbolic tokens are further subdivided into two categories based on their
current meaning. If the token currently stands for one of \MF's primitives,
we shall call it an {\sl operator\/}; otherwise we call it a {\sl name}.
Thus, almost every token you can think of is initially available for use
as a name, except those that were needed to define \MF's fundamental
operations. Such pre-reserved tokens can be redefined and used as
names, if you want to use them in your own way; but you probably
won't have to, since they're generally words that don't make desirable names.
Names are used for the variables in \MF\ programs. These variables can
be structured, like arrays and records in more conventional programming
languages; for example, `\.{x32a}' might be a variable that would be written
`\.{x[32].a}' in {\mc PASCAL}. A variable identifier has the following
syntax:
\syntaxlines{\<variable>\is\<name>\<suffix>\cr
\<suffix>\is\<empty>\alt\<suffix>\<subscript>\alt\<suffix>\<name>\cr
\<subscript>\is\<numeric token>\alt\.[\<numeric expression>\.]\cr}
A \<suffix> is taken to be as long as possible; i.e., if a \<suffix> is
followed by a \<subscript> or a \<name>, the \<suffix> will be extended.
Notice the two permissible forms of subscripts: a numeric token can be
written without brackets, or it can be a bracketed expression.
For example, if \.i is a variable whose value is~7, the variable identifiers
`\.{b7}', `\.{b007}', `\.{b[7]}', `\.{b[i]}', and `\.{b[21-2i]}' are
all equivalent. On the other hand, `\.{b.007}' would be different, since
it involves the fractional subscript `\.{.007}'. Also, `\.{b.i}' would
be different; in this case the `\.i' is simply a name that appears as a suffix,
not a subscript.
Incidentally, the `\.[' and `\.]' that appear in the syntax for
\<subscript> stand for any tokens that have \MF's primitive meanings
for left bracket and right bracket, respectively. They aren't necessarily
brackets; indeed, if the tokens `\.[' and `\.]' have been redefined,
they no longer can be used to produce subscripts. Similar remarks
apply to all of the tokens in all of the rules below. \MF\ doesn't look
at the form of a token; only the current meaning is relevant.
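The equivalence of `\.{b7}', `\.{b007}', `\.{b[7]}', `\.{b[i]}', and `\.{b[21-2i]}' can be checked with a rough Python sketch that reduces a variable identifier to a tuple of names and subscript values. Everything here is hypothetical (the name \.{canonical} included), brackets are assumed to have their primitive meaning, and bracketed expressions are handed to Python's own arithmetic after implicit products like `\.{2i}' are made explicit, which is a shortcut rather than \MF's expression syntax.

```python
import re
from decimal import Decimal
from fractions import Fraction

PIECE = re.compile(r"""
      (?P<number>\d+\.\d+|\.\d+|\d+)   # numeric token used as a subscript
    | \[(?P<expr>[^\]]*)\]             # bracketed subscript expression
    | (?P<name>[A-Za-z_]+)             # a name
    | (?P<skip>\s|\.)                  # spaces and lone periods are ignored
""", re.VERBOSE)


def evaluate(expr, values):
    # Shortcut (an assumption of this sketch): turn '2i' into '2*i'
    # and let Python evaluate the result with the given variable values.
    python_expr = re.sub(r"(\d)\s*([A-Za-z_(])", r"\1*\2", expr)
    return Fraction(eval(python_expr, {"__builtins__": {}}, dict(values)))


def canonical(identifier, values=None):
    """Reduce a variable identifier to a tuple of names and subscripts."""
    parts, pos = [], 0
    while pos < len(identifier):
        m = PIECE.match(identifier, pos)
        if m is None:
            raise ValueError("cannot read " + identifier[pos:])
        if m.group("number") is not None:
            parts.append(Fraction(Decimal(m.group("number"))))
        elif m.group("expr") is not None:
            parts.append(evaluate(m.group("expr"), values or {}))
        elif m.group("name") is not None:
            parts.append(m.group("name"))
        pos = m.end()
    return tuple(parts)


i7 = {"i": 7}
same = {canonical(v, i7) for v in ["b7", "b007", "b[7]", "b[i]", "b[21-2i]"]}
print(same)                                     # a single tuple
print(canonical("b.007"), canonical("b.i", i7))  # two different variables
```

The five equivalent spellings collapse to one tuple, while `\.{b.007}' and `\.{b.i}' each give something different.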
Variables can be of many types:
\syntaxlines{\<type>\is\.{numeric}\alt\.{string}\alt\.{boolean}\alt
\.{path}\alt\.{pen}\alt\.{edges}\alt\.{transform}\alt\.{pair}\cr}
To specify a type other than \.{numeric}, you simply give a type
declaration that lists the relevant identifiers. For example, the declaration
$$\hbox{\tt pair right, left, a.zz}$$
says that `\.{right}', `\.{left}', and `\.{a.zz}' will be variables of type
\.{pair}, so that equations like
$$\hbox{\tt right = -left = 2a.zz = (1,0)}$$
can be given later. These equations, incidentally, define
$\.{right}=(1,0)$, $\.{left}=(-1,0)$, and $\.{a.zz}=(.5,0)$.
The declaration of an array variable is independent of all the subscript
values; all subscripts in the declaration are therefore given in a
special anonymous form. For example,
$$\hbox{\tt path p[], x[]arc, f[][]}$$
declares all variables of the form \.{p[i]} and \.{x[i]arc} and \.{f[i][j]}
to be of type \.{path}. This declaration doesn't affect the types of
variables like \.p or \.{p3arc}. Incidentally, \MF\ considers a declaration
like `\.{path}~\.{p3}' to be illegal, since it falsely implies that only
\.{p3} (not \.{p2}) is a path; subscripts in a type declaration must
be anonymous.
Here are the formal syntax rules:
\syntaxlines{\<type declaration>\is\<type>\<declaration list>\cr
\<declaration list>\is\<declared variable>\alt
\<declaration list>\.,\<declared variable>\cr
\<declared variable>\is\<symbolic token>\<declared suffix>\cr
\<declared suffix>\is\<empty>\alt\<declared suffix>\.{[]}\alt
\<declared suffix>\<name>\cr}
A variable that hasn't been declared is automatically of type \.{numeric},
but its value is undefined until it appears in an equation. Declarations
destroy all previous values; thus, a declaration like `\.{numeric}~\.x'
isn't redundant, since it removes any existing value that \.x may have
had, of whatever type. Incidentally, this declaration doesn't affect
other values like \.{x2} or \.{x2arc} or \.{x.x} that might coexist with~\.x.
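The effect of anonymous subscripts in declarations can be modeled in a few lines of Python. This is a sketch under simplifying assumptions: variables are written as tuples of names and numbers, only types (not values) are tracked, and the class name \.{Declarations} is invented for the illustration.

```python
def skeleton(variable):
    """Replace every numeric subscript by the anonymous form '[]'."""
    return tuple("[]" if isinstance(part, (int, float)) else part
                 for part in variable)


class Declarations:
    def __init__(self):
        self.types = {}

    def declare(self, type_name, *declared_variables):
        for declared in declared_variables:
            if skeleton(declared) != tuple(declared):
                # e.g. `path p3' is illegal
                raise ValueError("subscripts in a declaration must be anonymous")
            self.types[tuple(declared)] = type_name

    def type_of(self, variable):
        # Undeclared variables are automatically numeric.
        return self.types.get(skeleton(variable), "numeric")


decls = Declarations()
# path p[], x[]arc, f[][]
decls.declare("path", ("p", "[]"), ("x", "[]", "arc"), ("f", "[]", "[]"))
print(decls.type_of(("p", 3)), decls.type_of(("p",)), decls.type_of(("p", 3, "arc")))
```

Here \.{p3} is reported as a path, while \.p and \.{p3arc} remain numeric, in agreement with the discussion of `\.{path p[], x[]arc, f[][]}' above.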
\newsection 3. The next lowest level: Expressions.
The declaration `\.{delimiters} \<symbolic token>\<symbolic token>'
declares a pair of tokens to be matching delimiters. For example, the
\.{PLAIN} base says `\.{delimiters}~\.{()}' so that parentheses do
the usual thing. Any distinct symbolic tokens can be defined to act
as delimiters, and many different pairs of delimiters can be
in use simultaneously.
There are eight kinds of expressions in \MF, corresponding to the
eight types numeric, string, etc. The full syntax is quite long,
but most of it falls into a simple pattern: There are four levels
of precedence called the primary level (tightest binding), the
term level (next tightest), the aggregate level (next loosest),
and the expression level (loosest). If $\alpha$, $\beta$, and
$\gamma$ are types, most of the syntax rules are of the following
general form:
\syntaxlines{\<$\alpha$ primary>\is\<$\alpha$ variable>\alt
\<$\alpha$ constant>\cr
\andalso\<left delimiter>\<$\alpha$ expression>\<right delimiter>\cr
\andalso\.{begingroup} \<program>\.; \<$\alpha$ expression> \.{endgroup}\cr
\andalso\<operator that takes type $\beta$ to type $\alpha$>
\<$\beta$ primary>\cr
\syntaxbreak
\<$\alpha$ term>\is\<$\alpha$ primary>\cr
\andalso\<$\beta$ term>\<multiplicative operator taking types $\beta$ and
$\gamma$ to type $\alpha$>\<$\gamma$ primary>\cr
\syntaxbreak
\<$\alpha$ aggregate>\is\<$\alpha$ term>\cr
\andalso\<$\beta$ aggregate>\<additive operator taking types $\beta$ and
$\gamma$ to type $\alpha$>\<$\gamma$ term>\cr
\syntaxbreak
\<$\alpha$ expression>\is\<$\alpha$ aggregate>\cr
\andalso\<$\beta$ expression>\<external operator taking types $\beta$ and
$\gamma$ to type $\alpha$>\<$\gamma$ aggregate>\cr}
These schematic rules don't give the whole story, but they give the
general structure of the plot.
The complete syntax appears below, as a set of rules that
can be used as a summary of all the features that \MF\ provides within
expressions. (However, some of these will not be implemented in the
early versions; and we shall see that others are definable as macros.
Hence the list is both too long and too short.)
\syntaxlines{\<expression>\is
\<boolean expression>\alt
\<string expression>\alt
\<path expression>\cr
\andalso \<pen expression>\alt
\<edges expression>\alt
\<transform expression>\cr
\andalso \<numeric expression>\alt
\<pair expression>\cr}
\noindent Boolean expressions:
\syntaxlines{\<boolean primary>\is
\<boolean variable>\alt \.{true}\alt\.{false}\cr
\andalso\<left delimiter>\<boolean expression>\<right delimiter>\cr
\andalso\.{begingroup} \<program>; \<boolean expression> \.{endgroup}\cr
\andalso \.{odd}\<numeric primary>\cr
\andalso \.{undefined}\<variable>\cr
\andalso \.{not}\<boolean primary>\cr
\syntaxbreak
\<boolean term>\is\<boolean primary>\cr
\andalso \<boolean term>\.{and}\<boolean primary>\cr
\syntaxbreak
\<boolean aggregate>\is\<boolean term>\cr
\andalso \<boolean aggregate>\.{or}\<boolean term>\cr
\syntaxbreak
\<boolean expression>\is\<boolean aggregate>\cr
\andalso\<numeric expression>\<relation>\<numeric expression>\cr
\andalso\<string expression>\<relation>\<string expression>\cr
\andalso\<pair expression>\<equality relation>\<pair expression>\cr
\<relation>\is\.<\alt\.{<=}\alt\.>\alt\.{>=}\alt\<equality relation>\cr
\<equality relation>\is\.=\alt\.{<>}\cr}
\noindent String expressions:
\syntaxlines{\<string primary>\is
\<string variable>\alt \<string token>\cr
\andalso\<left delimiter>\<string expression>\<right delimiter>\cr
\andalso\.{begingroup} \<program>; \<string expression> \.{endgroup}\cr
\andalso \.{string}\<suffix>\cr
\andalso \.{char}\<numeric primary>\cr
\andalso \.{decimal}\<numeric primary>\cr
\andalso \.{substring}\<pair expression>\.{of}\<string primary>\cr
\andalso \.{jobname}\cr
\syntaxbreak
\<string term>\is\<string primary>\cr
\syntaxbreak
\<string aggregate>\is\<string term>\cr
\andalso \<string aggregate>\.\&\<string term>\cr
\syntaxbreak
\<string expression>\is\<string aggregate>\cr}
\noindent Path expressions:
\syntaxlines{\<path primary>\is
\<path variable>\cr
\andalso\<left delimiter>\<path expression>\<right delimiter>\cr
\andalso\.{begingroup} \<program>; \<path expression> \.{endgroup}\cr
\andalso \.{subpath}\<pair expression>\.{of}\<path primary>\cr
\syntaxbreak
\<path term>\is\<path primary>\cr
\andalso\<path term>\<transform>\cr
\syntaxbreak
\<path aggregate>\is\<path term>\cr
\andalso \<pair aggregate>\<direction specification>\cr
\<direction specification>\is\<empty>\alt
\.{\char`\{curl}\<numeric expression>\.{\char`\}}\cr
\andalso\.{\char`\{}\<pair expression>\.{\char`\}}\alt
\.{\char`\{}\<numeric expression>,\<numeric expression>\.{\char`\}}\cr
\syntaxbreak
\<path expression>\is\<path aggregate>\cr
\andalso\<path expression>\<path join>\<path aggregate>\cr
\<path join>\is\.\&\alt\.{..}\alt\.{..}\<tension>\.{..}
\alt\.{..}\<controls>\.{..}\alt\.{..bounded..}\cr
\<tension>\is\.{tension}\<numeric expression>\alt
\.{tension}\<numeric expression>\.{and}\<numeric expression>\cr
\<controls>\is\.{controls}\<pair expression>\alt
\.{controls}\<pair expression>\.{and}\<pair expression>\cr}
\noindent Edges expressions:
\syntaxlines{\<edges primary>\is
\<edges variable>\cr
\andalso\<left delimiter>\<edges expression>\<right delimiter>\cr
\andalso\.{begingroup} \<program>; \<edges expression> \.{endgroup}\cr
\andalso \.{cull}\<pair expression>\.{of}\<edges primary>\cr
\syntaxbreak
\<edges term>\is\<edges primary>\cr
\andalso\<edges term>\<transform>\cr
\syntaxbreak
\<edges aggregate>\is\<edges term>\cr
\andalso \<edges aggregate>\.{union}\<edges term>\cr
\syntaxbreak
\<edges expression>\is\<edges aggregate>\cr}
\noindent Pen expressions:
\syntaxlines{\<pen primary>\is
\<pen variable>\cr
\andalso\<left delimiter>\<pen expression>\<right delimiter>\cr
\andalso\.{begingroup} \<program>; \<pen expression> \.{endgroup}\cr
\andalso\.{pencircle}\<transform sequence>\cr
\andalso \.{makepen}\<path primary>\<transform sequence>\cr
\syntaxbreak
\<pen term>\is\<pen primary>\cr
\syntaxbreak
\<pen aggregate>\is\<pen term>\cr
\syntaxbreak
\<pen expression>\is\<pen aggregate>\cr}
\noindent Transform expressions:
\syntaxlines{\<transform primary>\is
\<transform variable>\cr
\andalso\<left delimiter>\<transform expression>\<right delimiter>\cr
\andalso\.{begingroup} \<program>; \<transform expression> \.{endgroup}\cr
\syntaxbreak
\<transform term>\is\<transform primary>\cr
\andalso\<transform term>\<transform>\cr
\<transform>\is\.{rotated}\<numeric primary>\alt
\.{slanted}\<numeric primary>\alt\.{scaled}\<numeric primary>\cr
\andalso\.{shifted}\<pair primary>\alt
\.{transformed}\<transform primary>\cr
\andalso\.{xmult}\<numeric primary>\alt
\.{ymult}\<numeric primary>\alt
\.{zmult}\<pair primary>\cr
\<transform sequence>\is\<empty>\alt\<transform sequence>\<transform>\cr
\syntaxbreak
\<transform aggregate>\is\<transform term>\cr
\syntaxbreak
\<transform expression>\is\<transform aggregate>\cr}
\noindent Numeric expressions:
\syntaxlines{\<numeric primary>\is
\<numeric variable>\alt\<numeric token>\alt\<internal parameter>\cr
\andalso\<left delimiter>\<numeric expression>\<right delimiter>\cr
\andalso\.{begingroup} \<program>; \<numeric expression> \.{endgroup}\cr
\andalso\.{oct}\<string primary>\alt\.{hex}\<string primary>
\alt\.{ord}\<string primary>\cr
\andalso\.{length}\<string primary>\alt\.{length}\<path primary>\cr
\andalso\<pair part specifier>\<pair primary>
\alt\<transform part specifier>\<transform primary>\cr
\andalso\<unary operator>\<numeric primary>\alt\.{normaldeviate}\cr
\<pair part specifier>\is\.{xpart}\alt\.{ypart}\cr
\<transform part specifier>\is \.{xxpart}\alt\.{xypart}\alt
\.{yxpart}\alt\.{yypart}\alt \.{xpart}\alt\.{ypart}\cr
\<unary operator>\is\.+\alt\.-\alt\<numeric token>\cr
\andalso\.{sqrt}\alt\.{mexp}\alt\.{mlog}\alt\.{sind}\alt\.{cosd}\alt
\.{arctand}\alt\.{trunc}\alt\.{uniformdeviate}\cr
\syntaxbreak
\<numeric term>\is\<numeric primary>\cr
\andalso\<numeric term>\<multiplicative op>\<numeric primary>\cr
\andalso\<numeric term>\.[\<numeric expression>\.,\<numeric expression>\.]\cr
\<multiplicative op>\is\.*\alt\./\cr
\syntaxbreak
\<numeric aggregate>\is\<numeric term>\cr
\andalso\<numeric aggregate>\<additive op>\<numeric term>\cr
\andalso\<numeric aggregate>\.{++}\<numeric term>\cr
\<additive op>\is\.+\alt\.-\cr
\syntaxbreak
\<numeric expression>\is\<numeric aggregate>\cr}
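The primary/term/aggregate levels of the numeric rules map directly onto a recursive-descent evaluator, sketched here in Python. Several things are assumptions of the sketch rather than statements of this specification: the scanner is a simplified regular expression, variables are plain names looked up in a dictionary, the function-like unary operators are omitted, `\.{t[a,b]}' is taken to mean $a+t(b-a)$, and `\.{++}' is taken to mean $\sqrt{x^2+y^2}$.

```python
import math
import re

TOKEN = re.compile(r"\s*(\d+\.\d+|\.\d+|\d+|\+\+|[A-Za-z_]+|[-+*/()\[\],])")


class Parser:
    def __init__(self, text, env=None):
        self.toks, pos, text = [], 0, text.rstrip()
        while pos < len(text):
            m = TOKEN.match(text, pos)
            if m is None:
                raise SyntaxError("unexpected character at position %d" % pos)
            self.toks.append(m.group(1))
            pos = m.end()
        self.i, self.env = 0, env or {}

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError("expected %r, found %r" % (expected, tok))
        self.i += 1
        return tok

    def primary(self):                      # tightest binding
        tok = self.take()
        if tok == "(":
            value = self.expression()
            self.take(")")
            return value
        if tok == "+":
            return self.primary()
        if tok == "-":
            return -self.primary()
        if tok[0].isdigit() or tok[0] == ".":
            value, nxt = float(tok), self.peek()
            # A numeric token also acts as a unary operator, as in `2i'.
            if nxt is not None and (nxt == "(" or nxt[0].isalpha() or nxt[0] == "_"):
                return value * self.primary()
            return value
        if tok in self.env:
            return self.env[tok]
        raise SyntaxError("unexpected token %r" % tok)

    def term(self):
        value = self.primary()
        while self.peek() in ("*", "/", "["):
            op = self.take()
            if op == "[":                   # t[a,b], assumed to be a+t(b-a)
                a = self.expression()
                self.take(",")
                b = self.expression()
                self.take("]")
                value = a + value * (b - a)
            elif op == "*":
                value *= self.primary()
            else:
                value /= self.primary()
        return value

    def aggregate(self):
        value = self.term()
        while self.peek() in ("+", "-", "++"):
            op, right = self.take(), self.term()
            value = (value + right if op == "+" else
                     value - right if op == "-" else math.hypot(value, right))
        return value

    def expression(self):                   # loosest binding
        return self.aggregate()


def evaluate(text, env=None):
    parser = Parser(text, env)
    value = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError("extra tokens after the expression")
    return value


print(evaluate("2+3*4"), evaluate(".5[2,6]"), evaluate("21-2i", {"i": 7}))
```

Each method calls only the next tighter level for its right operand, which is exactly what makes the binary operators of a level associate to the left.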
\noindent Pair expressions:
\syntaxlines{\<pair primary>\is
\<pair variable>\cr
\andalso\<left delimiter>\<numeric expression>\.,
\<numeric expression>\<right delimiter>\cr
\andalso\<left delimiter>\<pair expression>\<right delimiter>\cr
\andalso\.{begingroup} \<program>; \<pair expression> \.{endgroup}\cr
\andalso\.{point}\<numeric expression>\.{of}\<path primary>\cr
\andalso\.{direction}\<numeric expression>\.{of}\<path primary>\cr
\andalso\.{precontrol}\<numeric expression>\.{of}\<path primary>\cr
\andalso\.{postcontrol}\<numeric expression>\.{of}\<path primary>\cr
\andalso\.{penoffset}\<pair expression>\.{of}\<pen primary>\cr
\andalso\<unary pair operator>\<pair primary>\cr
\<unary pair operator>\is\.+\alt\.-\alt\<numeric token>\cr
\syntaxbreak
\<pair term>\is\<pair primary>\cr
\andalso\<pair term>\<multiplicative op>\<numeric primary>\cr
\andalso\<numeric term>\<multiplicative op>\<pair primary>\cr
\andalso\<numeric term>\.[\<pair expression>\.,\<pair expression>\.]\cr
\andalso\<pair term>\<transform>\cr
\syntaxbreak
\<pair aggregate>\is\<pair term>\cr
\andalso\<pair aggregate>\<additive op>\<pair term>\cr
\andalso\<path aggregate>\.{intersect}\<path term>\cr
\syntaxbreak
\<pair expression>\is\<pair aggregate>\cr}
One of the most important consequences of these rules is that \MF\
always knows the type of the expression it is dealing with. For example,
a \<pair variable> is a variable that has been declared to have type
\.{pair}; such a variable will be recognized as a \<pair expression>
and not as any other kind of expression. There are only a few exceptions:
(1)~A \<pair aggregate> can be a \<path aggregate> when followed by an
empty \<direction specification>. In such a case, it is considered to be
only a \<pair aggregate> unless the \<path aggregate> interpretation is
mandatory. (2)~A \<boolean expression> like `\.{x=y}' that involves the
equality relation looks something like an equation. It should be
enclosed in delimiters unless it has been preceded by `\.{if}' or
`\.{while}' or something that makes a boolean interpretation mandatory.
\newsection 4. Macro definitions.
\MF's most powerful way to produce new high-level constructs is to make
one token stand for a combination of other tokens. The general way to
make this happen is to say
$$\hbox{\.{def}\<defined variable>\<parameter heading>\.=
\<replacement text>\.{enddef};}$$
this is called a {\sl definition}. For example, the definition
$$\hbox{\tt def -- = - - enddef}$$
simply says that the token `\.{--}' is to be replaced by two
consecutive `\.-' tokens. Here's an even more trivial definition:
$$\hbox{\tt def \char`\\\ = enddef;}$$
it causes a single backslash token to be replaced by nothing at all.
(\MF\ actually has this definition built in, because it makes
the use of \MF\ analogous to the use of \TeX, especially in command
lines when you're running the program.)
More interesting definitions include parameters that are replaced by
arguments when the token appears later. In this way definitions provide
the capabilities of subroutines as well as the features of simple macro
expansion. It's convenient for the sake of brevity to give the general
rules first, and examples later---even though good expository technique
would go the other way; so here's more syntax:
\syntaxlines{\<defined variable>\is\<declared variable>\alt
\<declared variable>\.{@\#}\cr
\<parameter heading>\is\<empty>\alt
\<parameter heading>\<parameter declaration>\cr
\andalso\<function parameter heading>\cr
\<parameter declaration>\is\.(\<parameter type>\<parameter tokens>\.)\cr
\<parameter type>\is\.{expr}\alt\.{text}\alt\.{suffix}\cr
\<parameter tokens>\is\<name>\alt\<parameter tokens>\.,\<name>\cr}
For example, the parameter heading
$$\hbox{\tt (suffix i,j)(text foo)(expr */*)}$$
introduces four parameter
tokens `\.i', `\.j', `\.{foo}', and `\.{*/*}'; they will be
treated specially whenever they occur within the \<replacement
text>.
The \<replacement text> is any sequence of tokens that doesn't
contain an unquoted \.{enddef} token. This rule needs some explanation:
There's a primitive operation (initially called \.{quote}) that
inhibits special interpretation of whatever token follows it in a
replacement text. A few
tokens are treated specially when \MF\ is scanning the replacement text of
a definition, unless they've been quoted:
\smallskip\item{1)}\.{enddef}, which ends the replacement text.
\item{2)}a parameter token (like `\.i' or `\.{*/*}' in the example above),
which is replaced by a special internal token that will tell \MF\ to
substitute an argument when this token is encountered again.
\item{3)}\.{quote}, which disables any special interpretation of the
immediately following token; this token doesn't survive in the
replacement text, unless of course it has been quoted.
\item{4)}\.{\#@}, \.@, and \.{@\#}, which will be replaced respectively by
the prefix, the name, and the suffix of this macro when it's used.
\smallskip\noindent
Rule 4 is the only unusual one, but the examples below should make it clear.
The mnemonic for distinguishing \.{\#@} from \.{@\#} is that the \.@ sign
represents where the macro is ``at,'' and the other sign retrieves the
tokens that either precede or follow the ``at'' position.
It may be worthwhile to reiterate the fact that these rules don't
really apply to the specific tokens \.{enddef}, \.{quote}, \.{\#@}, \.@, and
\.{@\#}; they apply to tokens whose meaning (at the time \MF\ is
recording the definition) is the same as the primitive meaning that those
other tokens had when \MF\ was started up. In particular, if the meanings
of \.{enddef}, \.{quote}, \.{\#@}, \.@, and \.{@\#} have changed at the time of
definition recording, no quoting is actually necessary. But some other
token had better have received the meaning of \.{enddef}, or the definition
will never end!
The defined quantity can be any \<defined variable>, not simply a \<name>;
for example, you can define `\.{a[]b@\#}'. What does this mean? Well,
it means that a supposed variable name like `\.{a35b42c.d}' becomes a
macro call instead. The prefix that gets substituted for \.{\#@} in the
replacement text will be `\.{a35}' in this example; the name that gets
substituted for \.@ will be `\.b'; and the suffix that gets substituted
for \.{@\#} will be `\.{42c.d}'. In simpler cases the prefix and suffix
are empty. If there's no \.{@\#} at the end of the \<defined variable>,
the suffix part is always empty. For example, after a definition of
\.{a[]b}, the text `\.{a35b42c.d}' will be interpreted as simply `\.{a35b}'
and \MF\ will not look ahead for a suffix.
One of the important definitions in the \.{PLAIN} base is
$$\hbox{\tt def z@\# = (x@\#,y@\#) enddef};$$
it converts variable names like `\.{z20}' and `\.{z[i]r}' into
`\.{(x20,y20)}' and `\.{(x[i]r,y[i]r)}', respectively, making it
possible to give convenient names to the aggregates of a pair
variable without explicitly defining \.z as a pair variable.
Consider also the definition
$$\hbox{\tt def p[]slope=(\#@dx,\#@dy) enddef};$$
this converts, e.g., `\.{p5slope}' into `\.{(p5dx,p5dy)}'.
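The way a variable-like macro call is split into prefix, name, and suffix can be imitated by the following Python sketch. It works on text rather than on tokens, it handles only unbracketed subscripts, and it reflects one reading of the rules above (the name is the last piece matched by the defined variable); the function \.{expand} is hypothetical.

```python
import re

PIECE = re.compile(r"\d+\.\d+|\.\d+|\d+|[A-Za-z_]+")


def expand(defined, replacement, variable):
    """Expand `variable` if it is a call of the macro `defined`.

    Returns (expansion, unread_rest), or None if it is not a call.
    """
    takes_suffix = defined.endswith("@#")
    head = defined[:-2] if takes_suffix else defined
    pattern = re.findall(r"\[\]|[A-Za-z_]+", head)
    found = [(m.group(), m.start(), m.end()) for m in PIECE.finditer(variable)]
    if len(found) < len(pattern):
        return None
    for want, (text, _, _) in zip(pattern, found):
        is_number = text[0].isdigit() or text[0] == "."
        if (want == "[]") != is_number or (want != "[]" and want != text):
            return None
    _, start, end = found[len(pattern) - 1]
    prefix, name = variable[:start], variable[start:end]
    suffix = variable[end:] if takes_suffix else ""
    rest = "" if takes_suffix else variable[end:]
    table = {"#@": prefix, "@": name, "@#": suffix}
    expansion = re.sub(r"#@|@#|@", lambda m: table[m.group()], replacement)
    return expansion, rest


print(expand("z@#", "(x@#,y@#)", "z20"))
print(expand("p[]slope", "(#@dx,#@dy)", "p5slope"))
print(expand("a[]b@#", "#@ @ @#", "a35b42c.d"))
print(expand("a[]b", "#@ @", "a35b42c.d"))
```

The third call shows the prefix `\.{a35}', the name `\.b', and the suffix `\.{42c.d}'; the fourth shows that without \.{@\#} the text `\.{42c.d}' is left unread.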
A defined quantity that has parameters must be supplied with corresponding
{\sl arguments\/} when it is used. The arguments are enclosed in
delimiters (usually parentheses). It's also possible to use a comma
between arguments; in this context a comma can be thought of as an
abbreviation for \<right delimiter>\<left delimiter> with respect to the
delimiter pair that preceded the comma.
The argument corresponding to a parameter of type \.{expr} can be any
\<expression>. This expression is evaluated before it is substituted into
the replacement text, hence there is no need to enclose it in parentheses
in the replacement text; it's like ``call by value'' in a conventional
programming language.
The argument corresponding to a parameter of type \.{suffix} can be any \<suffix>.
Subscripts in that suffix, if any, will have been evaluated and
replaced by (signed) numeric tokens of the corresponding value, before
the argument is actually substituted into the replacement text.
The argument corresponding to a parameter of type \.{text} is any sequence
of tokens that are balanced with respect to the enclosing delimiters.
This means that text arguments cannot be followed by commas. For example,
a list of three arguments can usually be given either as `\.{(a,b,c)}'
or `\.{(a,b)(c)}' or `\.{(a)(b,c)}' or `\.{(a)(b)(c)}'; but only the
second and last of these alternatives are permitted when the second
argument corresponds to a text parameter. Since delimiters need not
be parentheses, a text argument need not be balanced with respect
to parentheses; but it's usually not a good idea to play with
unbalanced parentheses unless you have a really special reason. Text
arguments are not ``evaluated'' at the time of a macro call; they are
simply stored away, and substituted for the corresponding parameter when
it shows up in the replacement text.
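The interplay of commas and text arguments can be sketched in Python as follows. The sketch assumes that parentheses are the only delimiters, that arguments are written without intervening spaces between groups, and that arguments are returned as unevaluated strings; the function name \.{split} is invented here.

```python
def split(text, kinds):
    """Split an argument list such as '(a,b)(c)' according to parameter kinds."""
    args, pos, after_comma = [], 0, False
    for kind in kinds:
        if not after_comma:
            if pos >= len(text) or text[pos] != "(":
                raise ValueError("missing argument")
            pos += 1
        start, depth = pos, 0
        while True:
            if pos >= len(text):
                raise ValueError("unbalanced delimiters")
            ch = text[pos]
            if ch == "(":
                depth += 1
            elif ch == ")":
                if depth == 0:
                    break
                depth -= 1
            elif ch == "," and depth == 0 and kind != "text":
                break           # a comma abbreviates ')(' except in text arguments
            pos += 1
        args.append(text[start:pos].strip())
        after_comma = text[pos] == ","
        pos += 1
    if after_comma:
        raise ValueError("more arguments than parameters")
    return args


print(split("(a,b,c)", ["expr", "expr", "expr"]))
print(split("(a,b)(c)", ["expr", "text", "expr"]))
print(split("(1,[i+1],7a)", ["text"]))
```

With kinds \.{expr}, \.{text}, \.{expr}, the forms `\.{(a,b,c)}' and `\.{(a)(b,c)}' are rejected by this sketch because the text argument swallows the comma and nothing is left for the third parameter.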
Here now are a few more examples, as promised. The first one
is intended to set up a triple of points that represent the position
of a broad pen. For example, `\.{pos(3,20,45)}' will stand for pen
position~3 at which the pen is 20~pixels broad and inclined at an
angle of $45^\circ$; there will be three points \.{z3}, \.{z3l}, and
\.{z3r}, representing the middle of the pen, its left edge, and its
right edge. Here's one way to define \.{pos} accordingly:
$$\vcenter{\halign{\tt#\hfil\cr
def pos(suffix i)(expr l,theta)=\cr
\ \ \ \ \ z.i.r-z.i.quote l = (l,0) rotated theta;\cr
\ \ \ \ \ z.i = .5[z.i.quote l, z.i.r] enddef\cr}}$$
Now the tokens `\.{pos(4,15,d+90)}' will expand into
`\.{z4r-z4l=(15,0)rotated135;} \.{z4=.5[z4l,z4r]}', if \.{d=45}. It
wouldn't have been necessary to quote any of the appearances of `\.l' if
another name had been chosen for the parameter. For example,
$$\vcenter{\halign{\tt#\hfil\cr
def pos(suffix i)(expr length,theta)=\cr
\ \ \ \ \ z.i.r-z.i.l = (length,0) rotated theta;\cr
\ \ \ \ \ z.i = .5[z.i.l, z.i.r] enddef\cr}}$$
would have been simpler. The \.{quote} operation has been provided
mostly to permit definitions within definitions, not to compensate for
poorly chosen parameter names.
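The two equations generated by \.{pos} determine the left and right edge points once the middle point is known. A Python sketch of that arithmetic follows; it assumes that \.{rotated} turns counterclockwise through an angle given in degrees, and the function name \.{edges} is made up for the illustration.

```python
import math


def edges(z, breadth, theta_degrees):
    """Return (left, right) with right-left = (breadth,0) rotated theta
    and z = .5[left, right]."""
    theta = math.radians(theta_degrees)
    dx, dy = breadth * math.cos(theta), breadth * math.sin(theta)
    left = (z[0] - dx / 2, z[1] - dy / 2)
    right = (z[0] + dx / 2, z[1] + dy / 2)
    return left, right


# pos(4,15,d+90) with d=45, taking z4=(100,50) as a made-up middle point
left, right = edges((100, 50), 15, 45 + 90)
print(left, right)
```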
The following example illustrates the use of a text parameter.
$$\vcenter{\halign{\tt#\hfil\cr
def label(text t)=\cr
\ \ \ \ \ forsuffixes \$=t do autolabel z\$ as "point"\&string\$ od enddef\cr}}$$
The expansion of `\.{label(1,[i+1],7a)}' will be
$$\hbox{\tt forsuffixes \$=1,[i+1],7a
do autolabel z\$ as "point"\&string\$ od}$$
and this, in turn, is a macro-like construction that essentially expands into
$$\hbox{\tt autolabel z1 as "point1"; autolabel z2 as "point2";
autolabel z7a as "point7a"}$$
if \.{i=1}, after which the `\.z' macro might expand the text even more.
Going back to the syntax for \<parameter heading>, you'll note that there's
something called a \<function parameter heading> that hasn't yet been
explained. Well, here's the missing syntax:
\syntaxlines{\<function parameter heading>\is\<name>\cr
\andalso\.{function}\<name>\<precedence indicator>\<name>\cr
\<precedence indicator>\is \.*\alt\.+\alt\.{..}\cr}
In this case the macro works just as usual, but its arguments are parsed
like the arguments in expressions; no delimiters are required. For example,
$$\hbox{\tt def ++ function x+y = sqrt(x*x+y*y) enddef}$$
would be a way to define the \.{++} operator. (However, \MF's built-in
\.{++} is much better, because it won't overflow when \.{x*x} or \.{y*y}
is out of range.) Exponentiation can be defined by
$$\hbox{\tt def ** function x*y = mexp(y*mlog x) enddef}$$
but such a definition blows up when $x<0$ and $y=2$. Here's a better way:
$$\vcenter{\halign{\tt#\hfil\cr
def ** function x*y =\cr
\ \ begingroup save t;\cr
\ \ if y=2: t=x*x; \% that's the most common case\cr
\ \ elseif x>0: t=mexp(y*mlog x);\cr
\ \ elseif y=trunc y: t=1;\cr
\ \ \ \ if y>=0: for n=1 step 1 upto y do t:=t*x od;\cr
\ \ \ \ else: for n=-1 step -1 downto y do t:=t/x od;\cr
\ \ \ \ fi;\cr
\ \ else: errmessage("Undefined power " \& decimal x \& "**" \& decimal y);\cr
\ \ \ \ t=1;\cr
\ \ fi; t endgroup enddef\cr}}$$
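For readers who want to trace the case analysis, here is a line-by-line transcription of the improved \.{**} definition into Python. It is only a sketch: \.{mexp} and \.{mlog} are assumed to be mutually inverse, so that \.{mexp(y*mlog x)} is rendered as $e^{y\ln x}$, and \.{errmessage} is imitated by appending to a list called \.{errors}, a name invented here.

```python
import math

errors = []  # stands in for errmessage


def power(x, y):
    if y == 2:                       # the most common case
        return x * x
    if x > 0:
        return math.exp(y * math.log(x))
    if y == math.trunc(y):           # integer exponent, x <= 0
        t = 1.0
        if y >= 0:
            for _ in range(int(y)):
                t *= x
        else:
            for _ in range(int(-y)):
                t /= x
        return t
    errors.append("Undefined power %s**%s" % (x, y))
    return 1.0


print(power(-3, 2), power(-2, 3), power(-2, -2), power(-8, .5), errors)
```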
The \<precedence indicator> for functions of two arguments is `\.*', `\.+', or
`\.{..}' to indicate how the arguments should be parsed by \MF. If it is
`\.{*}', the first argument is parsed as a term and the second as a primary;
if it is `\.{+}', the first argument is parsed as an aggregate and the
second as a term; if it is `\.{..}', the first argument is parsed as
an expression and the second as an aggregate.
Functions of a single variable have arguments at the primary level
of parsing. For example, plain \MF\ defines rounding as follows:
$$\hbox{\tt def round x = trunc(x+.5) enddef};$$
it's not necessary to put parentheses around the argument when that
argument is a numeric primary as in `\.{round 1.5u}'. And here's
a slightly more complex example that rounds its argument to a ``good''
value with respect to a given ``pen width'' \.{w[i]}:
$$\vcenter{\halign{\tt#\hfil\cr
def good[] (numeric x)=\cr
\ \ begingroup save t;\cr
\ \ if odd w@: t=(trunc x)+.5;\cr
\ \ else: t=round x;\cr
\ \ fi; t endgroup enddef\cr}}$$
Now, for example, if \.{w3=4.9} and \.{x10=15.2}, the result of
`\.{good3 x10}' will be 15.5.
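The arithmetic of this example can be checked with a few lines of Python. The sketch assumes that \.{odd} rounds its argument to the nearest integer before testing it, which is what the stated result of 15.5 requires when \.{w3=4.9}; the function names are illustrative only.

```python
import math


def round_mf(x):
    """The text's definition: round x = trunc(x+.5)."""
    return math.trunc(x + 0.5)


def good(pen_width, x):
    if round_mf(pen_width) % 2 == 1:     # odd w@ (rounding first is an assumption)
        return math.trunc(x) + 0.5
    return round_mf(x)


print(good(4.9, 15.2))   # w3=4.9, x10=15.2
```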
The following example defines a transform that is like a given one but
fixes the origin:
$$\hbox{\tt def unshifted t = t shifted-((0,0) transformed t) enddef}$$
And here is a way to define the sum of two transforms:
$$\vcenter{\halign{\tt#\hfil\cr
def transum function x+y =\cr
\ \ begingroup save t; transform t;\cr
\ \ (0,1) transformed t = (0,1) transformed x + (0,1) transformed y;\cr
\ \ (1,0) transformed t = (1,0) transformed x + (1,0) transformed y;\cr
\ \ (0,0) transformed t = (0,0) transformed x + (0,0) transformed y;\cr
\ \ t endgroup enddef\cr}}$$
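The three equations in \.{transum} suffice because a transform is determined by the images of $(0,0)$, $(1,0)$, and $(0,1)$. The following Python sketch spells this out; it assumes that a transform maps $(x,y)$ to $(t_x+t_{xx}x+t_{xy}y,\;t_y+t_{yx}x+t_{yy}y)$, with the six components named after the transform part specifiers, and the helper \.{fromimages} is invented for the illustration.

```python
from collections import namedtuple

Transform = namedtuple("Transform", "xpart ypart xxpart xypart yxpart yypart")
IDENTITY = Transform(0, 0, 1, 0, 0, 1)


def transformed(p, t):
    x, y = p
    return (t.xpart + t.xxpart * x + t.xypart * y,
            t.ypart + t.yxpart * x + t.yypart * y)


def fromimages(origin, unit_x, unit_y):
    """The transform taking (0,0), (1,0), (0,1) to the three given points."""
    ox, oy = origin
    return Transform(ox, oy,
                     unit_x[0] - ox, unit_y[0] - ox,
                     unit_x[1] - oy, unit_y[1] - oy)


def transum(x, y):
    images = []
    for p in ((0, 0), (1, 0), (0, 1)):
        px, py = transformed(p, x)
        qx, qy = transformed(p, y)
        images.append((px + qx, py + qy))
    return fromimages(*images)


def unshifted(t):
    ox, oy = transformed((0, 0), t)      # t shifted -((0,0) transformed t)
    return t._replace(xpart=t.xpart - ox, ypart=t.ypart - oy)


a = Transform(3, 4, 1, 2, 0, 1)
b = Transform(1, 1, 0, -1, 1, 0)
print(transum(a, b))
print(unshifted(a))
```

Under the stated assumption the sum simply adds the six components, and \.{unshifted} zeroes the first two.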
\newsection 5. Commands.
(Still to come.)
\vfill\end